Abstract

A sink-free orientation of a finite undirected graph is a choice of orientation 
for each edge such that every vertex has out-degree at least 1. Bubley and Dyer 
(1997) use Markov Chain Monte Carlo to sample approximately from the uniform
distribution on sink-free orientations in time $O(m^3 \log(1/\varepsilon))$, where m is the number
of edges and $\varepsilon$ the degree of approximation. Huber (1998) uses coupling from the
past to obtain an exact sample in time $O(m^4)$. We present a simple randomized
algorithm inspired by Wilson's cycle popping method which obtains an exact sample
in mean time at most $O(nm)$, where n is the number of vertices.

1. Introduction 

A common problem is to select a random sample efficiently from a large collection of 
combinatorial objects. There are many reasons one may wish to do this. One is to obtain 
an approximate count: Jerrum and Sinclair [JS] showed that if one can generate nearly 
uniform samples, then for each $\varepsilon > 0$, one can obtain the cardinality of the collection to
within a factor of $1 + \varepsilon$ with probability $1 - \varepsilon$, in just a little more time. When counting
the collection is #P-hard, as in the case of properly k-coloring a graph, this may be the
only reasonable way to count, since it is unlikely that #P-hard counting problems can
be solved exactly in polynomial time. Another reason to seek a sampling algorithm is
that it may shed light on properties of the typical sample. For example, the analysis 
of typical spanning trees of Cayley graphs [P, BLPS] relies on two algorithms, the first 
developed by Aldous [A] and Broder [B] and the second by Wilson [W]. The analysis of 
phase boundaries in typical domino tilings of regions known as Aztec diamonds also relies 
on a sampling algorithm, known as domino shuffling [CEP]. Finally, sample generation 
may be a way of producing conjectures about the typical sample via simulation, when no 
theorem is known (for example, the results in [CEP] were initially discovered this way). 

One common way to generate samples is Markov Chain Monte Carlo (MCMC). Here 
one finds an ergodic Markov chain whose equilibrium measure is the desired distribution
$\mu$; then one runs the chain until the distribution is close to $\mu$. Constructing such a chain
is usually easy (often when $\mu$ is uniform, there is a natural doubly stochastic transition
matrix) and the hard part is knowing how long to run it. This may be established via 
eigenvalue bounds, or via coupling arguments or stopping times. In cases where the 
time bounds on the chain are established via coupling, it is often possible to improve on 
MCMC by using coupling from the past (CFTP) to obtain an exact sample rather than 
an approximate one [PW1]. 

In this note we consider the generation of a random sink-free orientation (SFO) of a 
finite undirected graph. Sink-free orientations were introduced by Bubley and Dyer [BD],
who were motivated by an equivalence between counting them and counting satisfying 
assignments of Boolean formulas in conjunctive normal form in which each variable occurs 
at most twice (they call this problem Twice-SAT). Bubley and Dyer showed that counting 
sink-free orientations is #P-complete, so it is unlikely that an exact count can be obtained 
in polynomial time, and we must use approximate counting techniques based on nearly 
uniform sampling. 

Bubley and Dyer give an MCMC algorithm that produces a sample whose distribution 
is within $\varepsilon$ of uniform (in total variation) in time $O(m^3 \log(1/\varepsilon))$, where m is the number
of edges. Huber [H] uses Bubley and Dyer's analysis along with CFTP to produce an
exact uniform sample in mean time $O(m^4)$. The purpose of this note is to improve the
running time to $O(nm)$, where n is the number of vertices. Instead of MCMC, we use a
strong uniform time algorithm inspired by David Wilson's cycle popping algorithm [W] 
for generating uniform directed spanning trees. 

We now describe the problem and our results more precisely. Let G = (V, E) be a 
finite undirected graph. We allow multiple edges and self-loops (but at most one self-loop 
per vertex, since multiple self-loops play no useful role in sink-free orientations). We
define an n-cycle to be a ring of n vertices $v_0, \dots, v_{n-1}$ with edges from $v_i$ to $v_{i+1}$ for each
$i$ (taken modulo n; note that a 1-cycle is a vertex with a self-loop), and an n-lollipop to
be a path consisting of n vertices and n - 1 edges, with a self-loop added at one end.

An orientation of an edge between vertices v and w is a mapping of the set {head, tail} 
onto {v,w}. Thus, a self-loop has only one orientation, but all other edges have two. To 
reverse the orientation of an edge, swap its head and tail. An orientation of G is an 
orientation of each edge. A sink in an orientation is a vertex that is not the tail of any 
edge (a source is the opposite, i.e., not the head of any edge), and a sink-free orientation 
(SFO) of G is an orientation that contains no sinks. If any connected component of G is
a tree, then G has no SFO, and vice versa. Henceforth we restrict consideration to the
class S of graphs in which no component is a tree. Let $\mu_G$ denote the probability measure
assigning probability 1/N to each SFO of G, where N is the total number of SFOs of G.

Our algorithm, which we call "sink popping," works as follows. Given a graph, orient 
the edges by independent, fair coin flips. If this orientation has no sinks, then it is the 
SFO we seek. Otherwise, choose any sink, and randomly re-orient each edge that points 
into the sink (i.e., all of its edges). We call this popping the sink, for reasons that will 
become clearer in the next section. Repeat until there are no more sinks. 
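
The description above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `sink_popping`, the edge-list representation, and the convention of storing an orientation as the tail vertex of each edge are all assumptions made here. It rescans the graph for sinks after every pop, so it is simple rather than fast, and it assumes the input graph has no tree component (otherwise the loop never ends).

```python
import random

def sink_popping(n, edges, rng=None):
    """Naive sink popping on vertices 0..n-1 (hypothetical helper).

    edges is a list of pairs (u, v); (u, u) is a self-loop.
    Returns (tails, pops), where tails[i] is the vertex edge i points away from.
    """
    rng = rng or random.Random()
    incident = [[] for _ in range(n)]
    for i, (u, v) in enumerate(edges):
        incident[u].append(i)
        if v != u:
            incident[v].append(i)
    # Independent fair coin flip per edge (a self-loop has only one orientation).
    tails = [rng.choice(e) for e in edges]
    pops = 0
    while True:
        out_deg = [0] * n
        for t in tails:
            out_deg[t] += 1
        sinks = [v for v in range(n) if out_deg[v] == 0]
        if not sinks:
            return tails, pops
        s = sinks[0]  # any sink may be chosen
        # Pop the sink: re-randomize every edge incident to it.
        for i in incident[s]:
            tails[i] = rng.choice(edges[i])
        pops += 1

if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (2, 0)]
    print(sink_popping(3, triangle, random.Random(0)))
```

On a 3-lollipop (a path 0, 1, 2 with a self-loop at 2) there is exactly one sink-free orientation, so the output orientation is forced whatever the coin flips are.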

We now state our main result. 

Theorem 1.1. For every graph $G \in S$, sink popping terminates in finite time with
probability 1, regardless of how one chooses which sink to pop, and produces an output
whose distribution is precisely $\mu_G$. The average number of sinks that must be popped is
at most $\binom{n}{2}$, where n is the number of vertices of G, once again regardless of how one
chooses which sink to pop. Equality holds only for the n-cycle or the n-lollipop. The
expected number of times each particular vertex is popped is at most n - 1.

Sink popping is briefly mentioned at the end of [PW2], where the claims in the first 
sentence of the theorem are mentioned without detailed proof (and the running time is 
not analyzed). 

To show that sink popping's running time is $O(nm)$, we need to state the algorithm
slightly more carefully. The subtle point is to avoid spending lots of time searching for
sinks. We will keep a list of all sinks in the graph, and also a table showing the
out-degree of every vertex. Generating these initially takes time proportional to the sum of
the degrees of the vertices, or $O(m)$ time. Whenever we search for a sink to pop, we
simply take the first sink from the list. When we pop the sink, we update the table to
reflect the changes to its out-degree, and to those of its neighbors. Its neighbors may have
become sinks, in which case we append them to the list of sinks. (The purpose of the
table is to let us easily see whether the neighbors have become sinks, without having to
examine all their edges: in a complete graph, that would waste lots of time.) No sink can
be annihilated except the one we popped, since no two sinks can share a common edge (it
would have to point to both). Thus, each time we pop a sink at v, re-orienting its edges
and updating the list and table requires time $O(\deg(v))$. By Theorem 1.1, the expected
number of times v is popped is $O(n)$, so the total expected number of operations is

$$O(m) + \sum_{v \in V} O(n \deg(v)) = O(nm).$$

This time bound does not actually estimate the number of bit operations, but instead 
treats individual graph operations as units. 
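
The bookkeeping just described (a list of sinks plus an out-degree table) might look as follows in Python. This is a sketch under assumptions: the name `sink_popping_fast`, the use of a `deque` for the sink list, and the per-vertex pop counter are illustrative choices, not taken from the paper. Each pop touches only the edges at the popped vertex, so it costs time proportional to that vertex's degree.

```python
import random
from collections import deque

def sink_popping_fast(n, edges, rng=None):
    """Sink popping with a sink queue and an out-degree table (hypothetical helper).

    Returns (tails, pops_at): the final orientation and the pop count per vertex.
    Assumes no connected component of the graph is a tree.
    """
    rng = rng or random.Random()
    incident = [[] for _ in range(n)]
    for i, (u, v) in enumerate(edges):
        incident[u].append(i)
        if v != u:
            incident[v].append(i)
    tails = [rng.choice(e) for e in edges]
    out_deg = [0] * n
    for t in tails:
        out_deg[t] += 1
    sinks = deque(v for v in range(n) if out_deg[v] == 0)
    pops_at = [0] * n
    while sinks:
        s = sinks.popleft()  # take the first sink from the list
        pops_at[s] += 1
        for i in incident[s]:
            u, v = edges[i]
            other = v if u == s else u
            # Before the pop, edge i pointed from `other` into s.
            if rng.choice((s, other)) == s:
                tails[i] = s
                out_deg[s] += 1
                out_deg[other] -= 1
                if out_deg[other] == 0:
                    sinks.append(other)  # the neighbor has just become a sink
        if out_deg[s] == 0:
            sinks.append(s)  # every flip pointed back in, so s is still a sink
    return tails, pops_at
```

A neighbor is appended only at the moment its out-degree reaches zero, and it cannot already be in the queue because it had an edge pointing into the popped sink, so the queue never contains duplicates.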

In the next section we give another description of the sink popping algorithm and 
explain its connection to cycle popping. We also state some further results about sink 
popping with arbitrary initial conditions. The third section contains proofs of the diamond 
and strong uniformity lemmas, which are analogous to the equivalent lemmas for cycle
popping. The fourth section analyzes the running time. The fifth section derives some 
further facts about the running time. We conclude with some speculations and open 
questions. 

2. Sink popping and cycle popping 

Let H = (V, E) be a finite, connected, directed graph, and let v be any fixed vertex
of H. A directed spanning tree of (H, v) is a subset of edges so that every vertex other
than v has out-degree 1 and v has out-degree 0. Wilson [W, PW2] invented the following
algorithm, known as "cycle popping," for generating a uniform random directed spanning
tree. For each $w \in V \setminus \{v\}$ and $k \ge 0$, let $X_{w,k}$ be a random edge leading out of w,
chosen uniformly from among all edges leading out of w. Let these be independent as w
and k vary. For fixed w, imagine the collection $\{X_{w,k} : k \ge 0\}$ as a stack with $X_{w,0}$ on
top. Initially, look at the collection $\{X_{w,0} : w \in V \setminus \{v\}\}$, that is, consider the collection
$\{X_{w,f(w)}\}$ with $f \equiv 0$. If these form a directed spanning tree, stop and set the sample
equal to it. If not, there must be a cycle in this collection. Choose a cycle (it doesn't
matter which), and increment f(w) by 1 for each w in the cycle. (Imagine popping these
edges off the stack so the next element of each stack is now on top.) If the collection
$\{X_{w,f(w)} : w \in V \setminus \{v\}\}$ is now a directed spanning tree, stop and return this for your
sample, otherwise continue popping until you do stop. Wilson showed that the set of cycles
popped does not depend on which you choose to pop when you have a choice, and that
the algorithm stops almost surely at a directed spanning tree with uniform distribution.
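
For concreteness, here is a small Python sketch of cycle popping as just described. The names `cycle_popping` and `find_cycle` and the adjacency-dictionary input are assumptions of this sketch. Instead of storing whole stacks, it draws the next stack entry only when one is popped, which should give the same distribution because the entries are independent. It assumes every non-root vertex has at least one outgoing edge and can reach the root.

```python
import random

def cycle_popping(out_edges, root, rng=None):
    """out_edges[w] lists the heads of the edges leaving w (hypothetical format).

    Returns a dict mapping each non-root vertex to its successor in the tree.
    """
    rng = rng or random.Random()
    # Top entry of each stack.
    succ = {w: rng.choice(heads) for w, heads in out_edges.items() if w != root}
    while True:
        cycle = find_cycle(succ, root)
        if cycle is None:
            return succ
        for w in cycle:  # pop the cycle: expose the next entry of each stack
            succ[w] = rng.choice(out_edges[w])

def find_cycle(succ, root):
    """Return the vertices of some cycle in the successor map, or None."""
    state = {}  # 1 = on the current walk, 2 = known to lead to the root
    for start in succ:
        path = []
        w = start
        while w != root and w not in state:
            state[w] = 1
            path.append(w)
            w = succ[w]
        if w != root and state[w] == 1:
            return path[path.index(w):]
        for x in path:
            state[x] = 2
    return None
```

The cycle search is a plain walk along successors, so each pass costs time linear in the number of vertices; Wilson's own loop-erased random walk formulation avoids repeated full scans.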

We can describe sink popping in similar terms, which will be useful in the proof of 
Theorem 1.1. Let G = (V, E) be a finite undirected graph in S and let $\Omega = \Omega_0^{\mathbb{N}}$, where
$\Omega_0$ is the set of orientations of G, i.e., $\Omega$ consists of sequences of orientations of G. We
endow $\Omega$ with the $\sigma$-field $\mathcal{F}$ generated by the coordinate functions $X_{e,k}$ for $e \in E(G)$ and
$k \ge 0$, which specify the orientation of e in the k-th orientation in the sequence. Endow
$\Omega$ with the probability measure P under which the coordinate functions are independent
and each equally likely to yield either orientation. The intuition is that $\{X_{e,k} : k \ge 0\}$
represents a stack of arrows under the edge e. Define a random function $f : E \times \mathbb{N} \to \mathbb{N}$
as follows. Let $f(e, 0) = 0$ for all e. Given $f(e, k)$ for all e, define $f(\cdot, k+1)$ inductively:
If the collection $\{X_{e,f(e,k)} : e \in E\}$ is an SFO, then set $f(e, k+1) = f(e, k)$ for all e. If
not, choose a sink $v_k \in V$ arbitrarily, i.e., a vertex $v_k$ for which all edges e incident to
it are oriented toward it by the orientation $X_{e,f(e,k)}$. Let $f(e, k+1) = f(e, k)$ for e not
incident to $v_k$, and $f(e, k+1) = f(e, k) + 1$ for e incident to $v_k$. The dependence of f on
the choice rule (for choosing $v_k$, if there are several sinks) is suppressed in the notation,
as is the dependence on the choice of $\omega \in \Omega$ via the variables $\{X_{e,k}\}$. Intuitively, $f(e, k)$
is the original depth of the arrow under e now at the top of the stack at time k. Say that
$v_k$ is the sink popped at time k, and let $\tau = \min\{k : \{X_{e,f(e,k)}\} \text{ is an SFO}\}$ be the number
of pops before an SFO is obtained (conceivably $\tau = \infty$), and let $\eta = \eta(\omega, \text{choice rule})$ denote
the resulting SFO (if any).

Except for its last sentence, Theorem 1.1 is established by showing that $\tau < \infty$ with
probability 1, the law of $\{X_{e,f(e,\tau)}\}$ is precisely $\mu_G$, and $E\tau \le \binom{n}{2}$, with equality in and
only in the cases indicated. The first two lemmas are analogous to those used by Wilson 
[W] in establishing the validity of the cycle popping algorithm, and the third is the running 
time analysis. 

Lemma 2.2 (Diamond lemma). The number of pops $\tau \le \infty$ in a maximal popping
sequence is independent of the choice rule, as is the multiset $\{v_k : 0 \le k < \tau\}$. If $\tau < \infty$
then the resulting SFO $\eta$ is also independent of the choice rule.

The name "diamond" is meant to remind the reader that moving from the top of a 
diamond to the bottom by going southeast then southwest is equivalent to going southwest 
then southeast. This terminology comes from the article [E]. 
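
The content of Lemma 2.2 can be illustrated by fixing the stacks in advance and running two different choice rules against them. The Python sketch below is an illustration only; the class `Stacks`, the function `run`, and the seeding scheme are assumptions of this sketch. Each edge gets its own pseudo-random generator, so the entry at a given depth of a given edge's stack does not depend on the order in which sinks are popped.

```python
import random

class Stacks:
    """Entry k of edge e's stack: the tail of e in layer k, fixed by the seed."""
    def __init__(self, edges, seed):
        self.edges = edges
        self.rngs = [random.Random(seed * 1_000_003 + e) for e in range(len(edges))]
        self.layers = [[] for _ in edges]

    def get(self, e, k):
        # Extend edge e's stack lazily; its values depend only on (seed, e, k).
        while len(self.layers[e]) <= k:
            self.layers[e].append(self.rngs[e].choice(self.edges[e]))
        return self.layers[e][k]

def run(n, edges, seed, pick):
    """Pop sinks chosen by `pick` until none remain.

    Returns the final orientation and the sorted multiset of popped vertices.
    """
    stacks = Stacks(edges, seed)
    depth = [0] * len(edges)  # depth[e] plays the role of f(e, k)
    popped = []
    while True:
        tails = [stacks.get(e, depth[e]) for e in range(len(edges))]
        has_out = set(tails)
        sinks = [v for v in range(n) if v not in has_out]
        if not sinks:
            return tails, sorted(popped)
        s = pick(sinks)
        popped.append(s)
        for e, (u, v) in enumerate(edges):
            if s in (u, v):
                depth[e] += 1  # discard the top arrow under each edge at s

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # 4-cycle plus a chord
    print(run(4, edges, 0, min) == run(4, edges, 0, max))
```

Popping the lowest-numbered sink and popping the highest-numbered sink should end at the same orientation with the same multiset of popped vertices, as the lemma asserts.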

Lemma 2.3 (Strong uniform time). Let N be the number of SFOs of G. Then for
each $k \ge 0$, and each SFO $\eta$,

$$P\left(\tau = k,\ \{X_{e,f(e,\tau)}\} = \eta\right) = \frac{P(\tau = k)}{N}.$$

In other words, $\tau$ is a strong uniform time.

Lemma 2.4. If $G \in S$ has n vertices, then $E\tau \le \binom{n}{2}$, with equality only for the n-cycle
and the n-lollipop.

We conclude this section by stating two results that shed further light on the running 
time of the popping algorithm. 

Proposition 2.5. The distribution of $\tau$ for the n-cycle is exactly the same as the
distribution for the n-lollipop.

Proposition 2.6. Let $G \in S$ be any graph with n vertices and let $\mathcal{F}_0$ be the $\sigma$-field
generated by the variables $\{X_{e,0}\}$. Then the conditional mean running time $E(\tau \mid \mathcal{F}_0)$ is
always bounded by n(n - 1), and the only case to achieve this is an n-lollipop with all
edges oriented opposite to their orientation in the unique SFO.

3. Strong uniformity 

We first establish deterministic facts holding for every sample $\omega \in \Omega$. Say that a
sequence $v_0, \dots, v_{k-1}$ with $k \le \infty$ is a maximal popping sequence for $\omega$ if it is legal (i.e.,
only sinks are popped) and cannot be extended to larger k (thus if $k < \infty$ it results in
an SFO). Note that if $k = \infty$, we do not mean our notation to suggest that $v_0, v_1, \dots$ is
followed by a final term $v_{\infty-1}$; instead, $v_0, \dots, v_{\infty-1}$ denotes the infinite sequence $v_0, v_1, \dots$,
with no final term.

Let $f(e,k)$ denote the function $f(e,k,\omega,\mathbf{v})$ where $\omega$ is a sample point and $\mathbf{v}$ is a
specified legal sequence of pops of length at least k. Define an equivalence relation on
finite sequences $v_0, \dots, v_{k-1}$ of vertices of G by calling two sequences equivalent if one
can be changed to the other by a sequence of transpositions of pairs of vertices $(v_i, v_{i+1})$
that are not neighbors in G. (Note that such a transposition does not change whether a
sequence is a legal popping sequence.) The following lemma is useful, though obvious.

Lemma 3.7 (Deterministic strong Markov property). Given an integer j and
vertices $v_0, \dots, v_{j-1}$, let $\omega$ be any initial configuration for which $v_0, \dots, v_{j-1}$ is a legal popping
sequence and let $\omega'$ and $\omega$ be related by

$$X_{e,k}(\omega') = X_{e,k+f(e,j)}(\omega).$$

That is, $\omega'$ looks like $\omega$ after $v_0, \dots, v_{j-1}$ are popped. Then the following deterministic
strong Markov property (DSMP) holds. For any $k \le \infty$, the set of sequences $\{v_{j+i} : 0 \le
i \le k-1\}$ for which $v_0, \dots, v_{j+k-1}$ is a legal popping sequence for $\omega$ is the same as the
set of legal popping sequences of length k for $\omega'$. If $v_0, \dots, v_{j+k-1}$ is maximal for $\omega$ then
$v_j, \dots, v_{j+k-1}$ is maximal for $\omega'$, and leaves the same SFO (if $k < \infty$).

Extend the definition of equivalence to infinite sequences by saying that $v_0, v_1, \dots$ is
equivalent to $w_0, w_1, \dots$ if, by a sequence of transpositions as above applied to $v_0, v_1, \dots$,
one can transform $v_0, v_1, \dots$ so that arbitrarily long initial segments of it match those of
$w_0, w_1, \dots$. In particular, this implies that the multisets $\{v_k\}$ and $\{w_k\}$ are the same.

Let $l(\omega)$ denote the minimal length of a maximal popping sequence for $\omega$.

Lemma 3.8. The set of maximal popping sequences is an equivalence class. 

Proof. Let $v_0, \dots, v_{k-1}$ be a legal popping sequence for $\omega$ with $k < \infty$, and let $w_0, \dots, w_{k-1}$
be obtained from $v_0, \dots, v_{k-1}$ by transposing $v_i$ and $v_{i+1}$, which are not neighbors in G.
Suppose i = 0. Since the edges incident to $v_0$ are disjoint from the edges incident to $v_1$, we
may apply the DSMP to $v_0, v_1$ and to $v_1, v_0$ and see that $w_0, \dots, w_{k-1}$ is legal as well and
maximal if $v_0, \dots, v_{k-1}$ is. If i > 0, first apply the DSMP to $v_0, \dots, v_{i-1}$ and then use the
same argument. This shows that equivalent sequences are either both maximal popping
sequences or neither. (The case of $k = \infty$ is trivial, since infinite popping sequences are
automatically maximal.)

To prove the lemma, we induct on $l(\omega)$, and then deal with the case of $l(\omega) = \infty$.
It is clear when $l = 0$. Assuming the lemma for $l(\omega) < L$, let $l(\omega) = L$ with maximal
popping sequence $v_0, \dots, v_{L-1}$. Let $w_0, \dots, w_{k-1}$ be any other maximal popping sequence.
If $w_0 = v_0$, applying the DSMP and the induction hypothesis completes the induction. If
not, then consider the least i for which $v_i = w_0$, if any. When we pop $v_0, \dots, v_{L-1}$, the
orientation $\{X_{e,f(e,j)} : e \in E\}$ has a sink at $w_0$ for each j < i, since the sink at $w_0$ exists
until one of its edges is popped and no other sink can contain any such edge until $w_0$ is
popped. Thus i exists and $v_j$ cannot be a neighbor of $v_i$ for j < i. Hence, we can move
$v_i$ to the first position by a sequence of adjacent transpositions with non-neighbors in G.
We have seen that the resulting sequence $v_i, v_0, v_1, v_2, \dots, v_{i-1}, v_{i+1}, \dots, v_{L-1}$ is a maximal
popping sequence. Now apply the DSMP and the induction hypothesis to conclude that
$w_0, \dots, w_{k-1}$ is equivalent to $v_i, v_0, \dots, v_{i-1}, v_{i+1}, \dots, v_{L-1}$, and thus to $v_0, \dots, v_{L-1}$.

All that remains is the case of $l(\omega) = \infty$, i.e., the case when all maximal popping
sequences are infinite. Given two such sequences $v_0, v_1, \dots$ and $w_0, w_1, \dots$, the argument
from the previous paragraph shows that we can transform $v_0, v_1, \dots$ so that its first
element is $w_0$. Now applying the DSMP shows that we can bring arbitrarily long initial
segments into agreement, which is the definition of equivalence. □

Proof of the diamond lemma. From the previous lemma, we know that $\tau = l(\omega)$, so $\tau$ is
independent of the choice rule. Furthermore, since all maximal popping sequences are
equivalent, the multisets of popped vertices are the same. The assertion about SFOs
follows because the SFO depends only on which vertices were popped. □

Proof of strong uniform time. We prove by induction that for any SFO $\eta_0$ and any finite
sequence $v_0, \dots, v_{k-1}$, the following event has probability $2^{-m - \sum_{i=0}^{k-1} \deg_0(v_i)}$, where $\deg_0$
means the degree not counting self-loops: $\tau = k$, and $v_0, \dots, v_{k-1}$ is a legal popping
sequence for $\omega$, and $\eta(\omega) = \eta_0$. This is vacuously true when k = 0. Now the probability
that the singleton $v_0$ is a legal pop is $2^{-\deg_0(v_0)}$, so applying the DSMP we see that the
probability of a maximal popping sequence $v_0, v_1, \dots, v_{k-1}$ with $\eta = \eta_0$ is

$$2^{-\deg_0(v_0)} \cdot 2^{-m - \sum_{i=1}^{k-1} \deg_0(v_i)},$$

which completes the induction.

To find the probability of both $\tau = k$ and $\eta = \eta_0$ (with no restrictions on the popping
sequence), we must sum this probability over all equivalence classes of potential popping
sequences of length k. We sum over equivalence classes to avoid double counting, since
for any given $\omega$, Lemma 3.8 tells us that the set of maximal popping sequences is an
equivalence class. Since neither the summand nor the set of sequences depends on $\eta_0$, we
have proved the lemma. □

4. Analysis of the running time 

We still have not shown that $\tau$ is almost surely finite. While this may appear obvious
from some kind of Markov property, the choice rule makes things sticky and we find it
easiest to conclude this from the existence of a finite upper bound on the expected run
time. To bound $\tau$ we make repeated use of the following monotonicity principle. We let
$Q(G,v)$ denote the random number of times v is popped in a maximal popping sequence
(possibly $\infty$), which, by the diamond lemma, is well defined.

Lemma 4.9 (Monotonicity). Fix $G \in S$ and let $H \in S$ be a subgraph of G, that is,
$V(H) \subseteq V(G)$ and $E(H) \subseteq E(G)$. For $v \in V(H)$,

$$EQ(H,v) \ge EQ(G,v).$$

Proof. This is proved by stochastic domination: we run sink popping simultaneously on
H and G, using the same stacks for edges common to both graphs. Every legal popping
sequence on G restricts to a legal popping sequence on H as well, so under this coupling
$Q(H,v) \ge Q(G,v)$ always. □

Remark: Additionally, we see that equality occurs only when no SFO on G can require
further popping of v on H.

Proposition 4.10. If G is an n-cycle, then $E\tau = \binom{n}{2}$. Furthermore, conditioned on
starting with j edges oriented clockwise and n - j counterclockwise, the expected value of
$\tau$ is $2j(n-j)$.

Proof. At any time, some of the arrows point clockwise and others counterclockwise. Let
$Y_k$ be the number of arrows pointing clockwise at time k. Popping at any vertex causes
two opposite pointing arrows to be replaced by two random arrows. Thus $Y_{k+1}$ has the
distribution of $Y_k + Z$ where P(Z = 1) = P(Z = -1) = 1/4 and P(Z = 0) = 1/2.
Therefore $\{Y_k : k \ge 0\}$ is a simple random walk with delay probability of 1/2 absorbed at
0 and n. The expected absorption time from j is twice that for simple random walk, and
thus is $2j(n-j)$; see equation (3.5) on page 349 of Feller [F]. Hence $E\tau = 2E[Y_0(n - Y_0)]$,
which is twice the expected number of ordered pairs of edges where the first is initially
clockwise and the second initially counterclockwise. There are n(n - 1) ordered pairs of
distinct edges, each having these orientations with probability 1/4, so $E\tau = n(n-1)/2$. □
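
The two computations in this proof are easy to confirm with exact arithmetic. The Python sketch below (the function name `mean_pops_on_cycle` is an assumption of this sketch) checks that $h(j) = 2j(n-j)$ satisfies the one-step equation $h(j) = 1 + \tfrac14 h(j-1) + \tfrac12 h(j) + \tfrac14 h(j+1)$ of the lazy walk with $h(0) = h(n) = 0$, and then averages $h$ over a Binomial(n, 1/2) starting point.

```python
from fractions import Fraction
from math import comb

def mean_pops_on_cycle(n):
    """Check h(j) = 2j(n-j) against the lazy-walk recurrence, then average it."""
    h = [Fraction(2 * j * (n - j)) for j in range(n + 1)]
    assert h[0] == 0 and h[n] == 0  # absorbing states: no sinks remain
    for j in range(1, n):
        step = 1 + Fraction(1, 4) * h[j - 1] + Fraction(1, 2) * h[j] + Fraction(1, 4) * h[j + 1]
        assert h[j] == step
    # The initial number of clockwise arrows is Binomial(n, 1/2).
    return sum(Fraction(comb(n, j), 2 ** n) * h[j] for j in range(n + 1))

if __name__ == "__main__":
    for n in (3, 5, 8):
        print(n, mean_pops_on_cycle(n), Fraction(n * (n - 1), 2))
```

For each n the returned average agrees with $n(n-1)/2$, matching the proposition.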

Corollary 4.11. Let $S_0$ denote the class of graphs in which every vertex is in some cycle.
For $G \in S_0$ and $v \in V(G)$,

$$EQ(G,v) \le (n-1)/2.$$

Equality holds for all v if and only if G is an n-cycle.

Proof. Fix G and v and let H be a cycle containing v. By monotonicity, $EQ(G,v) \le
EQ(H,v)$, which is at most (n - 1)/2 by Proposition 4.10 and symmetry. In fact, $EQ(H,v)$
is strictly less than (n - 1)/2 unless H is an n-cycle. By the remark following the proof
of the monotonicity lemma, the inequality $EQ(G,v) \le EQ(H,v)$ is strict unless no SFO
on G can require further popping of v on H. In our case, H is an n-cycle, and G is an
n-cycle with some chords or self-loops added. Then (assuming G is not an n-cycle), there
is always an SFO on G that does not restrict to an SFO on H: if G has a self-loop, one
can choose an SFO on G such that H has a sink there; if G has a chord, one can use the
chord to create a short circuit across H giving a cycle of length less than n, orient this
cycle in a loop, and orient the other edges in G towards the cycle. If v is a sink in the
restriction to H of such an SFO on G, then strict inequality holds for v (because with
positive probability, sink popping on G will produce this SFO, and v will still need to be
popped in H). □

Lemma 4.12. For every $G \in S$ with n vertices, and each $v \in V(G)$,

$$EQ(G,v) \le n - 1,$$

and equality holds only when v is the vertex furthest from the self-loop in an n-lollipop.

Proof. We induct on G. The base step is $G \in S_0$, which is immediate from the previous
corollary. Assume for induction that the conclusion holds for all subgraphs of G. There
are three cases other than the base step.

Case 1. G is not connected. Then the result follows from the induction hypothesis
and the monotonicity lemma applied to the component H of G containing v. Equality
never occurs.

If G is connected and not in $S_0$, then G must contain an isthmus, i.e., an edge whose
removal disconnects G.

Case 2. Some edge e disconnects G into two components both in S. Again the result
follows from the induction hypothesis applied to the component H of G \ {e} containing
v, and equality never occurs.

Case 3. G has an isthmus and removal of any isthmus always leaves a component
that is a tree. Then G has a leaf z. If $v \ne z$ then the result follows immediately from
monotonicity with H = G \ {z}. If v is the only leaf, then let w be its neighbor. Choose a
popping order that pops v whenever possible, and otherwise executes any choice rule for
sink popping on H := G \ {v}. Initially there is a 1/2 chance that v is a sink, in which
case it is popped a mean 2 geometric number of times until the edge vw points to w.
Then, each time w is popped, the probability is 1/2 that this edge is reversed, in which
case it takes another mean 2 geometric number of pops to reverse it again. Thus

$$EQ(G,v) = 1 + EQ(H,w).$$

By induction, this is at most 1 + (n - 2). Equality occurs for v in G if and only if it occurs
for w in H, so we see by induction that it holds only at the end of a lollipop. □

Proof of Lemma 2.4. We prove the lemma by induction, following the pattern of the last
proof. The base step is $G \in S_0$, in which case the lemma follows from Corollary 4.11. In
cases 1 and 2 of the induction, if G is disconnected or the union of two graphs in S
along an added edge, the result is again immediate from the subadditivity of the function
$n \mapsto \binom{n}{2}$ and monotonicity. Finally, if G has a leaf v, we set H := G \ {v} and observe that
the numbers of pops $\tau_G$ and $\tau_H$ on G and H respectively are related by $\tau_G = \tau_H + Q(G,v)$.
Thus

$$E\tau_G = E\tau_H + EQ(G,v) \le \binom{n-1}{2} + (n-1) = \binom{n}{2}.$$

By the previous lemma, the last inequality is strict unless H is an n-lollipop and its vertex 
of degree 1 is the neighbor of v in G. This completes the induction. □ 

Proof of Theorem 1.1. The theorem follows immediately from combining Lemmas 2.2, 
2.3, 2.4, and 4.12. □ 

5. Further proofs 

The n-cycle and n-lollipop have the worst mean run times. Here we prove
Proposition 2.5, namely that the run time distributions are in fact identical.

Proof of Proposition 2.5. Number the vertices of the n-lollipop 0, ..., n - 1 with 0 being
the leaf. Always pop the sink with lowest number. Let $Y_k$ denote the sink popped at
time k. Clearly $Y_0$ is -1 plus a mean 2 geometric random variable, with the proviso that
a value of n - 1 or higher represents the terminal state in which no sink needs to be
popped. Let $\mathcal{F}_k$ be the $\sigma$-field generated by $Y_0, \dots, Y_{k-1}$. We claim that $\{Y_k : k \ge 0\}$ is a
time-homogeneous Markov chain with respect to $\{\mathcal{F}_k\}$ and that from any state j > 0 its
increments are -2 plus a mean 2 geometric, jumping to the terminal state if it reaches
n - 1 or greater, and from state 0 the same thing with -2 replaced by -1 (thus the
jump from 0 is resampled if it hits -1). All that is needed to check this is an inductive
verification that the orientations of edges between vertices of higher index than $Y_k$ are
conditionally i.i.d. fair coin flips given $\mathcal{F}_k$, which is straightforward.

Now we show that the running time on an n-cycle is also equal to the time for a
random walk to hit at least n - 1 when its increments are -2 plus a mean 2 geometric,
resampled if it hits -1. At time k, let $Y_k$ denote the least index of a sink when the edge
from n - 1 to 0 is oriented toward 0, and n - 1 minus the greatest index of a sink when the
edge is oriented toward n - 1. In other words, this quantity is the distance from the head
of the 0, n - 1 edge to the nearest sink in that direction. We always choose to pop that
sink. The only time the 0, n - 1 edge can change orientations is when $Y_k = 0$, in which
case $Y_{k+1}$ will be -1 plus a mean 2 geometric; when $Y_k > 0$ verification of the conditional
increment is trivial. The stopping rule is, again, that one must jump to n - 1 or greater,
and $Y_0$ has the right distribution for the same reason as before, so the sequence has the
same distribution. □

Our final result deals with the run time started from an arbitrary state, that is, the
conditional distribution of $\tau$ given $\mathcal{F}_0$ (as defined in Proposition 2.6). While this quantity
is a hidden variable as far as users of the algorithm are concerned, it has relevance to the 
distribution of the run time, as well as having some intrinsic interest. We begin again 
with a result on the n-cycle. 

Proposition 5.13. Let G be an n-cycle. Then for every $v \in V(G)$,

$$E(Q(G,v) \mid \mathcal{F}_0) \le 3n/4,$$

with equality if and only if n is even and all edges are oriented along the direction of
shortest travel to v.

Proof. Number the vertices 0, ..., n - 1 mod n. We first establish that the discrete
Laplacian of $E(Q(G,v) \mid \mathcal{F}_0)$ depends on the initial orientation of G via

$$E\left[\, Q(G,v) - \frac{Q(G,v+1) + Q(G,v-1)}{2} \,\middle|\, \mathcal{F}_0 \right] =
\begin{cases} 1 & \text{if } v \text{ is a sink,} \\ -1 & \text{if } v \text{ is a source, and} \\ 0 & \text{otherwise.} \end{cases} \qquad (5.1)$$

To see this, choose any popping order and let $Y(v,k)$ denote the in-degree of v at time
k, that is, the number of $e \in E(G)$ adjacent to v for which $X_{e,f(e,k)}$ is oriented toward v.
Then, conditionally on anything up to time k,

$$E\,Y(v,k+1) = E\,Y(v,k) - P(v_k = v) + \frac{P(v_k = v+1) + P(v_k = v-1)}{2},$$

since any pop at v reduces the expected in-degree by mean 1 and any pop at a neighbor
of v increases it by mean 1/2. Summing over k, conditioning on $\mathcal{F}_0$ and using $Y(v,\tau) = 1$
proves (5.1).

From Proposition 4.10 we know that $E(\tau \mid \mathcal{F}_0) = 2Y_0(n - Y_0)$ where $Y_0$ is the number
of initial clockwise arrows (edges oriented from i + 1 to i mod n for some i). This, along
with (5.1), determines $E(Q(G,\cdot) \mid \mathcal{F}_0)$, since the difference of any two candidates for this
function would be a harmonic function on the cycle, and hence constant.

In general,

$$E(Q(G,v) \mid \mathcal{F}_0) = \frac{k}{n}(1 + 3n - 2k) - \frac{2}{n}\sum_{j=1}^{k} a_j \qquad (5.2)$$

if there are k clockwise edges, pointing from $v + a_j$ to $v + a_j - 1$ for a set $\{a_1, \dots, a_k\} \subseteq
\{1, \dots, n\}$ (addition taken mod n). To prove this formula, we need only prove that the
right hand side satisfies the two properties that characterize the left hand side. The sum
over all v (i.e., $E(\tau \mid \mathcal{F}_0)$) is easy, since it equals

$$k(1 + 3n - 2k) - \frac{2}{n}\sum_{i=1}^{k}\sum_{a=1}^{n} a,$$

which does indeed simplify to $2k(n-k)$. To check that the right hand side of (5.2) works
in (5.1), we proceed as follows. Let f(v) be the right hand side of (5.2). Then

$$g(v) = f(v+1) - f(v) = -\frac{2}{n}\left(-k + n\,\delta_{\min\{a_i\},1}\right),$$

where $\delta$ is the Kronecker delta. Hence,

$$-\frac{g(v) - g(v-1)}{2} =
\begin{cases} 1 & \text{if } v \text{ is a sink,} \\ -1 & \text{if } v \text{ is a source, and} \\ 0 & \text{otherwise,} \end{cases}$$

as desired.

Equation (5.2) makes it easy to see when $E(Q(G,v) \mid \mathcal{F}_0)$ is maximized: that can occur
only when $\{a_1, \dots, a_k\} = \{1, \dots, k\}$, in which case $E(Q(G,v) \mid \mathcal{F}_0)$ equals $3n(k/n)(1 -
k/n)$. This quantity is bounded above by 3n/4, with equality if and only if n = 2k. □
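
The two characterizing properties can be checked by brute force for small n. The Python sketch below assumes that (5.2) reads $E(Q(G,v) \mid \mathcal{F}_0) = \frac{k}{n}(1 + 3n - 2k) - \frac{2}{n}\sum_j a_j$; that reading, the encoding of an orientation as a 0/1 tuple `cw` (with `cw[i] = 1` meaning the edge between i and i + 1 points from i + 1 to i), and the names `f` and `verify` are assumptions of this sketch. It confirms that the values sum to $2k(n-k)$ and that the discrete Laplacian matches (5.1), and it returns the largest value found.

```python
from fractions import Fraction
from itertools import product

def f(cw, v):
    """Assumed right hand side of (5.2) at vertex v for orientation cw."""
    n, k = len(cw), sum(cw)
    # a ranges over offsets with a clockwise edge from v + a to v + a - 1.
    offsets = sum(a for a in range(1, n + 1) if cw[(v + a - 1) % n])
    return Fraction(k, n) * (1 + 3 * n - 2 * k) - Fraction(2, n) * offsets

def verify(n):
    """Check both characterizing properties over all 2**n orientations."""
    best = Fraction(0)
    for cw in product((0, 1), repeat=n):
        k = sum(cw)
        vals = [f(cw, v) for v in range(n)]
        assert sum(vals) == 2 * k * (n - k)
        for v in range(n):
            lap = vals[v] - (vals[(v + 1) % n] + vals[v - 1]) / 2
            indeg = cw[v] + (1 - cw[v - 1])  # arrows pointing into v
            assert lap == indeg - 1          # 1 at a sink, -1 at a source
        best = max(best, max(vals))
    return best

if __name__ == "__main__":
    print(verify(6), Fraction(3 * 6, 4))
```

For n = 6 the maximum is 9/2 = 3n/4, while for odd n the maximum stays strictly below 3n/4, in line with the equality condition n = 2k.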

Proof of Proposition 2.6. Induct again, as in the proof of Lemma 4.12 and the main
theorem. Simultaneously, we show by induction that $E(Q(G,v) \mid \mathcal{F}_0) \le 2(n-1)$, with equality
only for the leaf of an n-lollipop and initial conditions $X_{e,0}$ all pointing toward v. Note
that the previous proposition proves this bound for all $G \in S_0$, if we use monotonicity
(which also holds conditioned on the initial orientation).

For the base case, $G \in S_0$ and Proposition 5.13 shows that in fact $E(\tau \mid \mathcal{F}_0) < 3n^2/4$.
There is strict inequality because equality in Proposition 5.13 cannot hold simultaneously
for all vertices in a cycle. It follows that $E(\tau \mid \mathcal{F}_0) < n(n-1)$ unless $n \le 3$. The cases
with $n \le 3$ are easily dealt with: those with n = 2 are trivial, and for n = 3 the worst
case contains a 3-cycle, which can be analyzed using the sharper bounds in the proof of
Proposition 5.13.

When G is not connected or is the union of two graphs in S along an added isthmus,
the n(n - 1) bound is immediate from subadditivity of n(n - 1) and monotonicity, and the
2(n - 1) bound follows from monotonicity. Finally, when G has a leaf v, set H := G \ {v} as
before. This time, in the worst case we know that v is a sink initially, so $E(Q(G,v) \mid \mathcal{F}_0)$ is
bounded by $2 + E(Q(H,w) \mid \mathcal{F}_0)$. This verifies the conclusion that $E(Q(G,v) \mid \mathcal{F}_0) \le 2(n-1)$,
and adding this to $E\tau_H$ gives, by induction, at most $(n-1)(n-2) + 2(n-1) = n(n-1)$,
which completes the proof of the upper bound; the conditions for equality are clear from
the proof. □

6. Questions 

It is tempting to view both cycle popping and sink popping as special cases of what 
might be called "partial rejection sampling:" to generate a random structure, choose 
a random candidate, and if it has any flaws, locally rerandomize until it is flawless. 
Does partial rejection sampling apply to other natural combinatorial problems? Can one 
develop a general theory? Note that Fill and Huber's randomness recycler [FH] also uses 
the idea of rejecting only part of a structure, although in a different way. 

Cycle popping was applied to the study of random spanning trees on $\mathbb{Z}^d$, as well as
some more general graphs, in [BLPS]. It would be interesting if sink popping could be
used similarly. Do random sink-free orientations on $\mathbb{Z}^d$ exhibit any interesting or surprising
structure?